Conversation

@shroffk
Collaborator

@shroffk shroffk commented Oct 7, 2025

Break very large index and update requests into smaller chunks.
Addresses issue #186.

@jacomago
Contributor

jacomago commented Oct 8, 2025

Oh, you're ahead of me #188

@shroffk
Collaborator Author

shroffk commented Oct 8, 2025

An anomaly for sure...
I did make the process parallel too, it might be overkill.

@jacomago
Contributor

jacomago commented Oct 8, 2025

> An anomaly for sure... I did make the process parallel too, it might be overkill.

Yes, not sure about the parallelization. Nor using skip.

I think some combination of the two would be good. I forgot to use `@Value`, for instance.

@shroffk
Collaborator Author

shroffk commented Oct 8, 2025

Well, I was trying to find the simplest way to avoid making copies while still supporting multithreading.
What is your primary concern with the skip?

@jacomago
Contributor

jacomago commented Oct 9, 2025

> Well, I was trying to find the simplest way to avoid making copies while still supporting multithreading. What is your primary concern with the skip?

I just think that with skip you have an extra loop, whereas mine still does just one loop. But I'm not sure it matters much.
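To make the skip-versus-single-loop question concrete, here is a minimal sketch of two ways to split a list into chunks. It is not taken from either PR's diff: the class and method names are hypothetical, and the skip variant collects each chunk into a list only so the two results can be compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; names are not from the ChannelFinder code base.
class ChunkingSketch {

    // Variant A: describe each chunk as skip(start).limit(chunkSize) on a fresh stream.
    // Collecting here copies the chunk; a caller could also consume the stream directly.
    static <T> List<List<T>> chunkWithSkip(List<T> items, int chunkSize) {
        requirePositive(chunkSize);
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            chunks.add(items.stream().skip(start).limit(chunkSize).collect(Collectors.toList()));
        }
        return chunks;
    }

    // Variant B: one loop over index ranges; subList returns a view, so no elements are copied.
    static <T> List<List<T>> chunkWithSubList(List<T> items, int chunkSize) {
        requirePositive(chunkSize);
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(items.subList(start, end));
        }
        return chunks;
    }

    private static void requirePositive(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5);
        System.out.println(chunkWithSkip(items, 2));    // prints [[1, 2], [3, 4], [5]]
        System.out.println(chunkWithSubList(items, 2)); // prints [[1, 2], [3, 4], [5]]
    }
}
```

Both variants produce the same chunks; whether the difference in traversal matters in practice would need measuring on realistic channel counts.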

@shroffk
Collaborator Author

shroffk commented Oct 9, 2025

Well, I am hoping the multi-threaded bit helps when a large number of channels is being indexed
(like populating for performance testing, or like in ESS's case where the whole CF is wiped out and recreated); doing it in chunks sequentially might have some performance issues.
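As a rough illustration of processing chunks in parallel rather than one after another, the sketch below dispatches index ranges through a parallel stream. It is an assumption about the general shape only, not the merged implementation; `ParallelChunkSketch`, `processInChunks`, and the `handler` callback are hypothetical names.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;

// Hypothetical illustration; not the code merged in this PR.
class ParallelChunkSketch {

    // Splits items into chunks of at most chunkSize and hands each chunk to handler.
    // Chunks may be handled concurrently and in any order, so handler must be thread-safe.
    // Returns the number of chunks dispatched.
    static <T> int processInChunks(List<T> items, int chunkSize, Consumer<List<T>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // Ceiling division: a partially filled final chunk still counts as a chunk.
        int chunkCount = (items.size() + chunkSize - 1) / chunkSize;
        IntStream.range(0, chunkCount).parallel().forEach(i -> {
            int start = i * chunkSize;
            int end = Math.min(start + chunkSize, items.size());
            // subList is a view onto the source list, so no elements are copied.
            handler.accept(items.subList(start, end));
        });
        return chunkCount;
    }
}
```

Because the chunks can complete in any order, anything that depends on the order in which results arrive has to sort or otherwise normalize them afterwards.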

@shroffk shroffk closed this Oct 9, 2025
@shroffk shroffk reopened this Oct 9, 2025
@shroffk
Collaborator Author

shroffk commented Oct 13, 2025

@tynanford, do you want to share your opinion?

@tynanford
Contributor

tagging @conorschofield as well

should index.chunk.size and processors.chunking.size have the same default of 10K?

I don't know enough Java to have an opinion on skip or how to structure the loops. We also do the same as ESS and re-populate the entire CF instance every so often. All the CF data is stored in IOCs or in a MATLAB script which adds MML metadata to CF. So parallelization sounds good to me.

@jacomago
Contributor

I'm wondering if some of the tests are breaking because there is no longer a consistent ordering of the returned channels.

@jacomago
Contributor

> should index.chunk.size and processors.chunking.size have the same default of 10K?

I based it on the default Elasticsearch result window size (`index.max_result_window`, which defaults to 10 000). 10 000 seems to be fine most of the time anyway. We hit the limit with a 170 000 IOC; I calculated that the problem starts at around 110 000, so I think 10 000 is a good default.

@github-actions

Overall Project 1.21% -2.93%
Files changed 0%

File Coverage
ChannelProcessorService.java 0% -33.98%
ChannelRepository.java 0% -25.33%

@github-actions

Overall Project 1.21% -2.95%
Files changed 0%

File Coverage
ChannelProcessorService.java 0% -33.98%
ChannelRepository.java 0% -25.5%

@shroffk
Collaborator Author

shroffk commented Oct 14, 2025

> I'm wondering if some of the tests are breaking because there is no longer a consistent ordering of the returned channels.

I don't think that is the case with the manual IT tests.

Maybe we can have one preference for both/all the chunking operations.
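One way a single shared preference could look is sketched below, using plain `java.util.Properties` so that it runs without a framework. The key `chunk.size` and the class name are hypothetical (the thread only mentions `index.chunk.size` and `processors.chunking.size`); the 10 000 fallback follows the default discussed above.

```java
import java.util.Properties;

// Hypothetical illustration of one chunk-size setting shared by all chunking operations.
class SharedChunkSizeSketch {

    // Fallback matching the 10 000 default discussed in this thread.
    static final int DEFAULT_CHUNK_SIZE = 10_000;

    // Reads the hypothetical "chunk.size" key; falls back to the default when it is absent or blank.
    static int chunkSize(Properties props) {
        String raw = props.getProperty("chunk.size");
        if (raw == null || raw.isBlank()) {
            return DEFAULT_CHUNK_SIZE;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException("chunk.size must be positive");
        }
        return value;
    }
}
```

Both the indexing path and the processor path would then call `chunkSize(props)` instead of each reading its own key.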

@sonarqubecloud

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)


@github-actions

Overall Project 1.21% -2.96%
Files changed 0%

File Coverage
ChannelProcessorService.java 0% -33.98%
ChannelRepository.java 0% -25.58%

@shroffk shroffk merged commit 1c7dcfb into master Oct 16, 2025
6 of 7 checks passed
@jacomago jacomago deleted the chunking branch October 17, 2025 06:23